Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor database session handling to async with retry logic, remove middleware #152

Merged
merged 66 commits into from
Oct 14, 2024

Conversation

ahdamin
Copy link
Contributor

@ahdamin ahdamin commented Sep 24, 2024

The PR refactors database session management to improve the performance and avoid continuous PendingRollbackError raised after the loss of database connection Database connection error: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')

Context & design choices

Since genotype-api utilizes FastAPI, which is asynchronous by design, that makes it a bit sensitive to session sharing issues. Multiple requests in an asynchronous environment can overlap and potentially share the session and may lead to issues such as PendingRollbackError.
The dependency injection for database sessions was enough for FastAPI. In Flask, middleware with a global session works because each request runs in its own thread, and the session is scoped to that thread.

When using SQLAlchemy's AsyncSession, Lazy Loading breaks because it does not natively use await, and you may encounter MissingGreenlet. The eager loading was implemented using the selectinload method to avoid the error, without affecting the logic of the service/endpoint.

In some cases, when implementing async queries, an explicit join condition was added because the code switched from accessing the object’s relationship to using a query.

Fixed

Handles lost database connection

@clingen-sthlm clingen-sthlm temporarily deployed to stage September 24, 2024 14:21 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 26, 2024 11:11 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 26, 2024 11:12 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 26, 2024 12:19 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 26, 2024 15:07 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 27, 2024 09:56 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 30, 2024 09:52 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage September 30, 2024 14:56 Inactive
@ahdamin ahdamin force-pushed the fix-lost-db-connection branch from fd14a5b to cafb128 Compare October 2, 2024 08:17
Copy link
Contributor

@ChrOertlin ChrOertlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the in depth explanation, I learned from that!

Just curious about the async and await propagation. Do we need that throughout the whole code or is it enough to have it in the api when it calls the top level function of a multi layer call chain?

I.e. would this be enough?

async def get_something():
       await do_the_get()
       return

## the do the get
def do_the_get():
      return the_gotten

instead of also having the async and await in the do_the_get()

Edit: I asked GPT seems to be dependent on the underlying function and whether their operation are synchronous or asynchronous of nature.

@ahdamin
Copy link
Contributor Author

ahdamin commented Oct 9, 2024

Just curious about the async and await propagation. Do we need that throughout the whole code or is it enough to have it in the api when it calls the top level function of a multi layer call chain?

I.e. would this be enough?

async def get_something():
       await do_the_get()
       return

## the do the get
def do_the_get():
      return the_gotten

instead of also having the async and await in the do_the_get()

Edit: I asked GPT seems to be dependent on the underlying function and whether their operation are synchronous or asynchronous of nature.

I understand your point, and it would be simpler to use only async and await at the API level. However, as you mentioned, it depends on the nature of each function. If a function performs async operations, then both the function itself and any functions calling it need to be marked async and use await.

@Vince-janv
Copy link
Contributor

I understand your point, and it would be simpler to use only async and await at the API level. However, as you mentioned, it depends on the nature of each function. If a function performs async operations, then both the function itself and any functions calling it need to be marked async and use await.

I'm not convinced adding async/await to each function and return is necessary. It feels weird that to use FastAPI one needs to drastically change the syntax in every python function. Should it not be added only to the endpoints which take a lot of time?

@ahdamin
Copy link
Contributor Author

ahdamin commented Oct 9, 2024

I'm not convinced adding async/await to each function and return is necessary. It feels weird that to use FastAPI one needs to drastically change the syntax in every python function. Should it not be added only to the endpoints which take a lot of time?

Of course, it is not necessary to change every Python function. Async operations are related to IO operations such as database operations, file operations, etc. Some functions were not changed here.
Can we go for a synchronous FastAPI application? Of course, we can! The question is why?

@clingen-sthlm clingen-sthlm temporarily deployed to stage October 10, 2024 23:17 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage October 11, 2024 14:26 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage October 11, 2024 14:42 Inactive
@clingen-sthlm clingen-sthlm temporarily deployed to stage October 11, 2024 14:59 Inactive
Copy link

@clingen-sthlm clingen-sthlm temporarily deployed to stage October 11, 2024 15:34 Inactive
@ahdamin ahdamin merged commit 43d8681 into main Oct 14, 2024
5 checks passed
@ahdamin ahdamin deleted the fix-lost-db-connection branch October 14, 2024 07:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Genotype pending database transactions cause the webui to crash
5 participants